Rasters are “spatial data models that define space as an array of equally sized cells, arranged in rows and columns, and composed of single or multiple bands” (ESRI, https://desktop.arcgis.com/en/arcmap/latest/manage-data/raster-and-images/what-is-raster-data.htm).
In other words, they’re grids. Each element of that grid can be called a pixel or a cell. The grids are georeferenced, so that we can relate each cell to its location on the face of the Earth.
Raster data is generally either categorical (aka thematic), like land cover data, or continuous, like elevation or temperature data.
Rasters can be stacked on top of each other to create multi-layer raster stack. The layers are often called bands, if we’re dealing with satellite images.
Rasters are very efficient at storing lots of spatial information with less memory, because the location data is embedded in the data format. Unlike vector data, where you have to store the coordinates of each point, with rasters you only need to store the number of rows and columns and the extent of the raster.
On the downside, you have to make your data fit into a gridded system.
It’s free. Open, reproducible science. Huge support community. No carpal tunnel from pointing and clicking. Did I mention free?
There are some things that other platforms (e.g., QGIS, ArcGIS) can do faster or better (like hydrology modeling), but often the speed gains are negligible, and you can always call those other programs from R if you really need them.
First, let’s load all the libraries we’ll be using:
The R package we’ll be using to work with raster data is called terra. Terra is the replacement for the tried and trusty ‘raster’ package; basically, terra is a more powerful and faster version of the raster package, written by the same team. Check out the official website here: https://rspatial.org/spatial-terra/index.html
Terra’s format for rasters is the “SpatRaster” class.
A cool thing about SpatRaster objects is that if the underlying file is too large to bring into R’s memory, the SpatRaster object will just point to the file, allowing us to use or manipulate the data without loading the whole dataset.
# we specify the number of columns and the locations of the corners
x<-rast(ncol=5, nrow=5, xmin=-1000, xmax=1000, ymin=-1000, ymax=1000)
# when we call the object, it gives a summary of its attributes
x
## class : SpatRaster
## dimensions : 5, 5, 1 (nrow, ncol, nlyr)
## resolution : 400, 400 (x, y)
## extent : -1000, 1000, -1000, 1000 (xmin, xmax, ymin, ymax)
## coord. ref. :
str(x)
# There are a huge number of components to the 'SpatRaster' class.
# Let's focus on a few of the most important:
res(x)
# and we can access the extent using the ext() function
ext(x)
# each layer is named
names(x)
# if we want to rename, we can use names()
names(x)<-"MyLayer"
# let's look at the values:
values(x)
## Warning: [readValues] raster has no values
Oh, right, we forgot to give the raster values.
We need n=5*5 values (rows x columns); let’s generate them from a normal distribution using rnorm()
values(x) <- rnorm(5*5, mean=0, sd=10)
plot(x)
text(x)
We can also manually assign values to specific cells using their row and
column indices:
x[1,1]<-100
x[5,1]<-100
plot(x)
text(x)
Assign a value of 100 to the middle cell in the raster (third row, third column), and the top right and bottom right corner cells.
We can also do basic mathematical and logical operations on the raster values; for example, the following sets all cell values that are below 0 to 0.
x[x<0]<- 0
plot(x)
text(x)
Exercise: try to multiply, divide, or conduct other operations on your rasters or on individual cells in your rasters
# hint! make a copy of raster x for your outputs, for example:
x2 <- x*2
plot(x2)
text(x2)
We’ll start with a Digital Elevation Model of Western North America at a 180 m resolution( https://www.sciencebase.gov/catalog/item/542aeff7e4b057766eed287f). We’ve downloaded it to the project folder at “./dem_westernus/dem180_hf/dem180_hf.tif” so that you don’t have to.
# read in a GeoTiff
dem <- rast("./dem_westernus/dem180_hf/dem180_hf.tif")
# if we call the object, we can see the min, max, resolution, and coordinate reference
dem
## class : SpatRaster
## dimensions : 12973, 12618, 1 (nrow, ncol, nlyr)
## resolution : 180, 180 (x, y)
## extent : -2549736, -278496.3, 880978.9, 3216119 (xmin, xmax, ymin, ymax)
## coord. ref. : USA_Contiguous_Albers_Equal_Area_Conic_USGS_version
## source : dem180_hf.tif
## name : dem180_hf
## min value : -83
## max value : 4406
What if we only want the data for a smaller area? We can crop the raster to that smaller region. Let’s do this for Wyoming. First, we need a polygon for Wyoming. We can get that using the geodata package.
USA <- geodata::gadm(country='USA', level=1, path="./") # this downloads all the state boundaries
# we'll select just Wyoming
WY <- USA[USA$NAME_1=="Wyoming",]
plot(dem)
plot(WY, add=T)
Uh… Why didn’t the polygon ‘WY’ plot? There’s no error message!
Are dem and WY in the same coordinate system?
crs(dem)==crs(WY)
## [1] FALSE
No!
So, we’ll project dem into the same resolution as WY.
First, we’ll reduce the resolution to speed up processing - reprojection is computationally intensive.
dem.900 <- aggregate(dem, fact=5) # we use a factor of 5 to change the resolution from 180 to 900 m
plot(dem.900)
# so, what if we want to use a different projection, or have it unprojected (lat/long)?
dem.900.wgs84 <- project(x=dem.900, y=crs(WY))# we could also do: project(x=dem.900, y="+proj=longlat +datum=WGS84")
crs(dem.900.wgs84)==crs(WY)
## [1] TRUE
# Now let's try plotting Wyoming over the elevation raster
plot(dem.900.wgs84)
plot(WY, add=T)
We can use crop() to make a new raster that’s restricted to Wyoming
WY.dem <- crop(dem.900.wgs84, WY)
WY.dem
## class : SpatRaster
## dimensions : 405, 709, 1 (nrow, ncol, nlyr)
## resolution : 0.009876016, 0.009876016 (x, y)
## extent : -111.0511, -104.049, 41.0012, 45.00098 (xmin, xmax, ymin, ymax)
## coord. ref. : +proj=longlat +datum=WGS84 +no_defs
## source : memory
## name : dem180_hf
## min value : 961.7866
## max value : 3958.754
plot(WY.dem)
Part 1: Crop the Digital Elevation Model to Albany County, Wyoming.
Part 2: Find the mean elevation of Albany County.
# hint: use the following code to get the polygon for Albany County
USA.counties <- geodata::gadm(country='USA', level=2, path="./") # this downloads all the county boundaries in the us
# we'll select just Albany County
ALB <- USA.counties[USA.counties$NAME_1=="Wyoming" & USA.counties$NAME_2=="Albany",]
plot(WY.dem)
plot(ALB, add=T)
As we’ve seen, the base plot() function generates some decent-looking maps, but they don’t offer very much control over graphics. Luckily, bringing spatial data into ggplot is easy!
In the past, one had to use helper functions to turn the spatial data into a dataframe (using fortify) or a tibble (using df_spatial) that ggplot could interpret. This can still be a useful approach.
WY_dem_tibb <- ggspatial::df_spatial(WY.dem)
class(WY_dem_tibb)
## [1] "tbl_df" "tbl" "data.frame"
head(WY_dem_tibb)
## # A tibble: 6 × 3
## x y band1
## <dbl> <dbl> <dbl>
## 1 -111. 45.0 2586.
## 2 -111. 45.0 2563.
## 3 -111. 45.0 2510.
## 4 -111. 45.0 2483.
## 5 -111. 45.0 2417.
## 6 -111. 45.0 2359.
# we use geom_raster to plot the raster data. For the aesthetic function aes(),
# x and y refer to the longitude and latitude, respectively, and the values of the
# raster are represented in 'band1', which is the label given it by the df_spatial() function
ggplot() +
geom_raster(data = WY_dem_tibb, aes(x=x, y=y, fill=band1))+
labs(fill = "Elevation")+
theme_minimal()
Several packages have developed helper functions so that you don’t have to convert your SpatRaster or other spatial data class into a dataframe before using it in ggplot. We’re going to use geom_spatraster() from the tidyterra package to directly use the SpatRaster in ggplot.
tidyterra is a new tool for plotting objects from terra in ggplot; see here for more: https://dieghernan.github.io/tidyterra/
Alternatively, there are packages for plotting raster data that have been around for longer, but these don’t fit seamlessly into the ggplot workflow. Examples include the packages tmap and rasterVis: https://oscarperpinan.github.io/rastervis/, https://cran.r-project.org/web/packages/tmap/vignettes/tmap-getstarted.html
The following is modified from the tidyterra page above.
# look Mom no hands! geom_spatraster lets us use WY.dem, the SpatRaster, directly with ggplot
wy.dem.plot <- ggplot() +
geom_spatraster(data = WY.dem)+
scale_fill_whitebox_c() +
coord_sf(expand = FALSE)+
labs(fill = "Elevation")
wy.dem.plot
# to add a scale bar and arrow, we can use annotation_scale
# and annotation_north_arrow from the ggspatial package
wy.dem.plot2 <- wy.dem.plot +
annotation_scale(location = "br")+
annotation_north_arrow(location = "tr", which_north = "true",
style=north_arrow_minimal)
wy.dem.plot2
The functions that start with ‘scale_fill_’ set the color palette and ramp to plot the rasters. There is a whole ecosystem of different ‘scale_fill_’ functions and palettes out there. Here, we can illustrate what happens if we switch to a different palette in scale_fill_whitebox_c, and use theme_mimimal() for a cleaner map:
# We can change to a different palette from scale_fill_whitebox_c()
ggplot() +
geom_spatraster(data = WY.dem)+
scale_fill_whitebox_c(
palette = "muted",
labels = scales::label_number(suffix = " m")
) +
labs(fill = "Elevation")+
theme_minimal()
Part 1: Plot the SpatRaster of Albany County’s elevation (that you made in Exercise 3) using ggplot() and geom_spatraster()
Part 2: Try changing palette = “muted” to a different value. Choose from “atlas”, “high_relief”, “arid”, “soft”, “muted”, “purple”, “viridi”, “gn_yl”, “pi_y_g”, “bl_yl_rd”, and “deep”.
If you’re not familiar with PRISM, you can check out their site at prism.oregonstate.edu/
The following code comes from https://michaelpaulschramm.com/posts/2022-07-22-drought/
## convert to terra
ppt_norm_1 <- pd_to_file(ppt_norm_1)
ppt_norm_1 <- rast(ppt_norm_1)
## change the layer names to something sensible
names(ppt_norm_1)
## [1] "PRISM_ppt_30yr_normal_4kmM3_01_bil" "PRISM_ppt_30yr_normal_4kmM3_02_bil"
## [3] "PRISM_ppt_30yr_normal_4kmM3_03_bil" "PRISM_ppt_30yr_normal_4kmM3_04_bil"
## [5] "PRISM_ppt_30yr_normal_4kmM3_05_bil" "PRISM_ppt_30yr_normal_4kmM3_06_bil"
## [7] "PRISM_ppt_30yr_normal_4kmM3_07_bil" "PRISM_ppt_30yr_normal_4kmM3_08_bil"
## [9] "PRISM_ppt_30yr_normal_4kmM3_09_bil" "PRISM_ppt_30yr_normal_4kmM3_10_bil"
## [11] "PRISM_ppt_30yr_normal_4kmM3_11_bil" "PRISM_ppt_30yr_normal_4kmM3_12_bil"
names(ppt_norm_1) <- month.name[1:12]
names(ppt_norm_1)
## [1] "January" "February" "March" "April" "May" "June"
## [7] "July" "August" "September" "October" "November" "December"
ppt_norm_1
## class : SpatRaster
## dimensions : 621, 1405, 12 (nrow, ncol, nlyr)
## resolution : 0.04166667, 0.04166667 (x, y)
## extent : -125.0208, -66.47917, 24.0625, 49.9375 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat NAD83
## sources : PRISM_ppt_30yr_normal_4kmM3_01_bil.bil
## PRISM_ppt_30yr_normal_4kmM3_02_bil.bil
## PRISM_ppt_30yr_normal_4kmM3_03_bil.bil
## ... and 9 more source(s)
## names : January, February, March, April, May, June, ...
## min values : 3.5573, 2.6586, 4.5119, 0.7360, 0.1417, 0.0000, ...
## max values : 888.7029, 657.2840, 674.7000, 445.5850, 290.7313, 295.2746, ...
# to plot each layer one after the other, can use animate()
# this works in the R console, but not in Rmarkdown
animate(ppt_norm_1)
We can take the sum of all the layers in ppt_norm_1 to give us the total annual precipitation
precip_yr <- sum(ppt_norm_1)
precip_yr # notice that nlyr is 1
## class : SpatRaster
## dimensions : 621, 1405, 1 (nrow, ncol, nlyr)
## resolution : 0.04166667, 0.04166667 (x, y)
## extent : -125.0208, -66.47917, 24.0625, 49.9375 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat NAD83
## source : memory
## name : sum
## min value : 46.9398
## max value : 5671.803
plot(precip_yr)
crs(ppt_norm_1)==crs(WY)
## [1] FALSE
WY.2 <- project(WY, crs(ppt_norm_1))
crs(ppt_norm_1)==crs(WY.2)
## [1] TRUE
WY_ppt <- crop(ppt_norm_1, WY.2)
plot(WY_ppt)
scale_fill_scico() is another palette function that lets us use some cool palettes.
## plot
WY_ppt_plot <- ggplot() +
geom_spatraster(data = WY_ppt) +
scale_fill_scico(name = "Precipitation (mm)", palette = "lapaz", direction = -1,
trans = "pseudo_log",
breaks = c(0,10,100,500),
na.value = "transparent") +
facet_wrap(~lyr, ncol = 3, nrow = 4) +
theme_minimal()+
labs(caption = "1991-2020 (30-year) Monthly Normal Precipitation\nSource: PRISM Climate Group (https://www.prism.oregonstate.edu/)")
WY_ppt_plot
Play around with the breaks argument in scale_fill_scico(). What happens when you change the numbers?
Notice how Wyoming is plotted as a rectangle. This is ok, but if we want to have the plot account for the curvature of the Earth, we have to give it information about the coordinate reference system. We can do this with the coord_sf() function, which takes a crs object or code.
WY_ppt_plot_latlong <- WY_ppt_plot +
coord_sf(crs = 2163)
WY_ppt_plot_latlong
Always define your plot size and resolution first, then adjust font and point sizes. This can help with getting the text size and point size of lines on the plot to be appropriate for the size of your plot. More details on this can be found here: https://www.christophenicault.com/post/understand_size_dimension_ggplot2/
PNG is a common raster graphic format:
png("wyomingprecip.png", width=8, height=8, res = 100, units = "in") # the png() command opens a graphic device
WY_ppt_plot_latlong
dev.off() # dev.off() closes the graphic device
## png
## 2
# this will write a .png file in the working directory
Look at your working directory and see what wyomingprecip.png looks like!